Abstract 

We provide a unifying axiomatics for Rényi's entropy and the non-extensive entropy of Tsallis. 
It is shown that the resulting entropy coincides with Csiszár's measure of directed divergence 
known from communication theory. 
1 Introduction 



It has been known since Shannon's seminal paper [1] that Shannon's information 
measure (or entropy) represents mere idealized information appearing only in 
situations when the buffer memory (or storage capacity) of a transmitting channel is 
infinite. As the latter is not satisfied in many practical situations, information 
theorists have invented various remedies to deal with such cases. These usually consist of 
substituting Shannon's information measure with information measures of other types. 
Particularly distinct among them is a one-parametric class of information measures 
discovered by A. Rényi. It was later realized by Linnik that these so-called Rényi 
entropies (RE's) are associated with the decoding limit if the source is compressed to 
$\mathcal{I}_q$, and the parameter $q$ essentially tells how much the tail of a probability 
distribution should count in the calculation of the Rényi entropy. Recently an operational 
characterization of RE in terms of $\beta$-cutoff rates was provided by Csiszár [2]. 

On the other hand, pioneering works of E. Jaynes [3] in the mid 50's revealed that the 
Gibbs entropy of statistical physics represents the Shannon entropy whenever the sample 
space of Shannon's entropy is identified with the set of all (coarse-grained) microstates. 
However, contrary to information theory, attempts to extend the concept of Gibbs's 
entropy have started to penetrate into statistical physics only recently. This happened 
by realizing that there are indeed many situations of practical interest requiring more 
"exotic" statistics which do not conform with the classical Gibbsian MaxEnt. Percolation, 
polymers, protein folding, critical phenomena, cosmic rays, turbulence or stock market 
returns provide examples. 

One obvious way of generalizing Gibbs's entropy would be to look at the axiomatic 
rules determining Shannon's information measure. In fact, the usual axiomatics of 
Khinchin [4] offers various "plausible" generalizations. The additivity of independent 
mean information is then a natural axiom to attack. Along those lines only two distinct 
generalization schemes have been explored in the literature so far. The first consists of a 
redefinition of the statistical mean and the second generalizes the additivity rule. The 
respective entropies are then RE's [5] and various deformed entropies [6]. While RE's are 
natural tools in statistical systems with a non-standard scaling behavior, deformed entropies 
seem to be relevant to systems with embedded non-locality. A suitable merger of the 
above generalizations could provide a new conceptual playground suitable for a statistical 
description of systems possessing both self-similarity and non-locality. Examples are 
the early universe cosmological phase transitions or the currently much studied quantum 
phase transitions. In this paper we attempt to merge RE's and Tsallis entropies. 



2 Rényi's entropy: entropy of self-similar systems 

As already mentioned, RE represents a step towards more realistic situations encountered 
in communication theory. Among a myriad of information measures RE's distinguish 
themselves by a firm operational characterization given in terms of block coding and 
hypotheses testing. The Rényi parameter $q$ then represents the so-called $\beta$-cutoff 
rates [2]. RE of order $q$ ($q > 0$) of a discrete distribution 
$\mathcal{P} = \{p_1, \ldots, p_n\}$ reads 

$$\mathcal{I}_q(\mathcal{P}) = \frac{1}{1-q} \ln \left( \sum_{k=1}^{n} (p_k)^q \right) . \tag{1}$$

Apart from coding theory RE's have proved to be indispensable tools in various branches 
of physics. Examples are chaotic dynamical systems or multifractals. In his original 
work Rényi [5] introduced a one-parameter family of information measures (=RE) 
which he based on axiomatic considerations. In the course of time his axioms have 
been sharpened by Daróczy [7] and others [8]. Most recently it was proved in Ref. [9] 
that RE can be conveniently characterized by the following set of axioms: 

(1) For a given integer $n$ and given $\mathcal{P} = \{p_1, p_2, \ldots, p_n\}$ 
($p_k \geq 0$, $\sum_k p_k = 1$), $\mathcal{I}(\mathcal{P})$ is continuous with respect 
to all its arguments. 

(2) For a given integer $n$, $\mathcal{I}(p_1, p_2, \ldots, p_n)$ takes its largest value 
for $p_k = 1/n$ ($k = 1, 2, \ldots, n$) with the normalization 
$\mathcal{I}(1/2, 1/2) = \ln 2$. 

(3) For a given $q \in \mathbb{R}$ define $\rho_k(q) = (p_k)^q / \sum_k (p_k)^q$ 
($\mathcal{P}$ is affiliated to $A$); then 
$\mathcal{I}(A \cap B) = \mathcal{I}(A) + \mathcal{I}(B|A)$ where 
$\mathcal{I}(B|A) = g^{-1}\left( \sum_k \rho_k(q)\, g(\mathcal{I}(B|A = A_k)) \right)$. 

(4) $g$ is invertible and positive in $[0, \infty)$. 

(5) $\mathcal{I}(p_1, p_2, \ldots, p_n, 0) = \mathcal{I}(p_1, p_2, \ldots, p_n)$. 

The above axioms markedly differ from those utilized in [5,7,8]. One particularly distinct 
point is the appearance of the escort distribution $\rho(q)$ in axiom 3. Note also that RE 
of two independent experiments is additive. In fact, it was shown in Ref. [5] that RE 
is the most general information measure compatible with additivity of independent 
information and the Kolmogorov axioms of probability theory. 
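
The additivity of RE for independent experiments and the escort distribution entering axiom 3 can be checked numerically. The following is a minimal Python sketch, assuming the standard form $\mathcal{I}_q = \ln(\sum_k p_k^q)/(1-q)$ with natural logarithms (as fixed by the normalization $\ln 2$ in axiom 2). The function names `renyi_entropy`, `escort` and `joint_independent` are illustrative and do not come from the original text.

```python
import math

def renyi_entropy(p, q):
    """Renyi entropy of order q in nats; the q = 1 case is the Shannon entropy."""
    p = [x for x in p if x > 0]          # zero-probability outcomes drop out (axiom 5)
    if abs(q - 1.0) < 1e-12:
        return -sum(x * math.log(x) for x in p)
    return math.log(sum(x ** q for x in p)) / (1.0 - q)

def escort(p, q):
    """Escort distribution rho_k(q) = p_k^q / sum_j p_j^q (assumes q > 0)."""
    weights = [x ** q for x in p]
    total = sum(weights)
    return [w / total for w in weights]

def joint_independent(p, r):
    """Joint distribution of two independent experiments."""
    return [x * y for x in p for y in r]

if __name__ == "__main__":
    p, r, q = [0.5, 0.3, 0.2], [0.7, 0.3], 2.0
    lhs = renyi_entropy(joint_independent(p, r), q)
    rhs = renyi_entropy(p, q) + renyi_entropy(r, q)
    print(lhs, rhs)                      # the two values should agree
    print(escort(p, q))                  # escort weights emphasize likely outcomes for q > 1
```

For $q > 1$ the escort weights shift towards the more probable outcomes, and for $0 < q < 1$ towards the tail, which is one way to read the role of $q$ described in the Introduction.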



3 Tsallis entropy: entropy of long distance correlated systems 



Among the variety of deformed entropies the currently popular one is the $q$-additivity 
prescription and the related Tsallis entropy (TE). As the classical additivity of independent 
information is destroyed in this case, new, more exotic physical mechanisms must be 
sought to comply with TE predictions. One may guess that the typical playground for 
TE should be cases when two statistically independent systems have non-vanishing 
long-range/time correlations: e.g., statistical systems with quantum non-locality. In 
the case of discrete distributions $\mathcal{P} = \{p_1, \ldots, p_n\}$ TE takes the form: 

$$\mathcal{S}_q(\mathcal{P}) = \frac{1}{1-q} \left[ \sum_{k=1}^{n} (p_k)^q - 1 \right] . \tag{2}$$

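An axiomatic treatment was recently proposed in Ref. [10]; it consists of four axioms: 

(1) For a given integer $n$ and given $\mathcal{P} = \{p_1, p_2, \ldots, p_n\}$ 
($p_k \geq 0$, $\sum_k p_k = 1$), $\mathcal{S}(\mathcal{P})$ is continuous with respect 
to all its arguments. 

(2) For a given integer $n$, $\mathcal{S}(\mathcal{P})$ takes its largest value for 
$p_k = 1/n$ ($k = 1, 2, \ldots, n$). 

(3) For a given $q \in \mathbb{R}$; 
$\mathcal{S}(A \cap B) = \mathcal{S}(A) + \mathcal{S}(B|A) + (1-q)\mathcal{S}(A)\mathcal{S}(B|A)$ 
with $\mathcal{S}(B|A) = \sum_k \rho_k(q)\, \mathcal{S}(B|A = A_k)$. 

(4) $\mathcal{S}(p_1, p_2, \ldots, p_n, 0) = \mathcal{S}(p_1, p_2, \ldots, p_n)$. 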

As said before, one keeps here the linear mean but generalizes the additivity law. In 
fact, the additivity law in axiom 3 is nothing but the Jackson sum of the $q$-calculus. 
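
The $q$-additivity of axiom 3 can be illustrated for two independent experiments, where the conditional entropy reduces to $\mathcal{S}(B)$. The sketch below assumes the form $\mathcal{S}_q = (\sum_k p_k^q - 1)/(1-q)$; the helper names `tsallis_entropy` and `q_sum` are hypothetical and chosen only for this illustration.

```python
import math

def tsallis_entropy(p, q):
    """Tsallis entropy S_q = (sum_k p_k^q - 1) / (1 - q); Shannon entropy at q = 1."""
    p = [x for x in p if x > 0]          # zero-probability outcomes drop out (axiom 4)
    if abs(q - 1.0) < 1e-12:
        return -sum(x * math.log(x) for x in p)
    return (sum(x ** q for x in p) - 1.0) / (1.0 - q)

def q_sum(x, y, q):
    """Deformed sum x + y + (1 - q) x y appearing in axiom 3."""
    return x + y + (1.0 - q) * x * y

if __name__ == "__main__":
    p, r, q = [0.5, 0.3, 0.2], [0.7, 0.3], 0.6
    joint = [x * y for x in p for y in r]            # independent experiments
    print(tsallis_entropy(joint, q))
    print(q_sum(tsallis_entropy(p, q), tsallis_entropy(r, q), q))
```

The two printed values should coincide up to rounding, while the plain sum $\mathcal{S}(A) + \mathcal{S}(B)$ differs from them whenever $q \neq 1$.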



4 J-A axioms and solutions 



Let us combine the previous two axiomatics in the following natural way: 

(1) For a given integer $n$ and given $\mathcal{P} = \{p_1, p_2, \ldots, p_n\}$ 
($p_k \geq 0$, $\sum_k p_k = 1$), $\mathcal{D}(\mathcal{P})$ is continuous with respect 
to all its arguments. 

(2) For a given integer $n$, $\mathcal{D}(\mathcal{P})$ takes its largest value for 
$p_k = 1/n$ ($k = 1, 2, \ldots, n$). 

(3) For a given $q \in \mathbb{R}$; 
$\mathcal{D}(A \cap B) = \mathcal{D}(A) + \mathcal{D}(B|A) + (1-q)\mathcal{D}(A)\mathcal{D}(B|A)$ 
with $\mathcal{D}(B|A) = f^{-1}\left( \sum_k \rho_k(q)\, f(\mathcal{D}(B|A = A_k)) \right)$. 

(4) $f$ is invertible and positive in $[0, \infty)$. 

(5) $\mathcal{D}(p_1, p_2, \ldots, p_n, 0) = \mathcal{D}(p_1, p_2, \ldots, p_n)$. 

We will now show that the above axioms allow for only one class of solutions which 
will be closely related to the cross-entropy measures of Havrda and Charvat [11]. 



5 Basic steps in the proof 



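Let us first denote $\mathcal{D}(1/n, 1/n, \ldots, 1/n) = \mathcal{L}(n)$. Axioms 2 and 5 
then imply that 
$\mathcal{L}(n) = \mathcal{D}(1/n, \ldots, 1/n, 0) \leq \mathcal{D}(1/(n+1), \ldots, 1/(n+1)) = \mathcal{L}(n+1)$. 
Consequently $\mathcal{L}$ is a non-decreasing function. To determine the form of 
$\mathcal{L}(n)$ we will assume that $A^{(1)}, \ldots, A^{(m)}$ are independent experiments 
each with $r$ equally probable outcomes: 

$$\mathcal{D}(A^{(k)}) = \mathcal{D}(1/r, \ldots, 1/r) = \mathcal{L}(r) , \qquad (1 \leq k \leq m) . \tag{3}$$

Repeated application of axiom 3 then leads to 

$$\mathcal{L}(r^m) = \frac{1}{1-q} \left[ \left( 1 + (1-q)\mathcal{L}(r) \right)^m - 1 \right] . \tag{4}$$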



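Taking the partial derivative of both sides of (4) with respect to $m$ and putting $m = 1$ 
afterwards we get the differential equation 

$$\frac{(1-q)\, d\mathcal{L}}{\left( 1 + (1-q)\mathcal{L} \right) \ln\left( 1 + (1-q)\mathcal{L} \right)} = \frac{dr}{r \ln r} . \tag{5}$$

It is easy to verify that the general solution of (5) has the form 

$$\mathcal{L}(r) \equiv \mathcal{L}_q(r) = \frac{1}{1-q} \left( r^{c(q)} - 1 \right) . \tag{6}$$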
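
That a function of the form $\mathcal{L}_q(r) = (r^{c(q)} - 1)/(1-q)$ satisfies both the composition rule (4) and the differential equation (5) can be confirmed numerically. The sketch below is only a sanity check under the assumption that $c(q)$ is treated as a free number `c`; the names `L`, `rhs_eq4` and `ode_residual` are illustrative.

```python
import math

def L(r, q, c):
    """Candidate solution (6): L_q(r) = (r^c - 1) / (1 - q), with c standing for c(q)."""
    return (r ** c - 1.0) / (1.0 - q)

def rhs_eq4(r, m, q, c):
    """Right-hand side of (4): [(1 + (1 - q) L(r))^m - 1] / (1 - q)."""
    return ((1.0 + (1.0 - q) * L(r, q, c)) ** m - 1.0) / (1.0 - q)

def ode_residual(r, q, c, h=1e-6):
    """Residual of (5) rewritten as (1-q) r ln(r) dL/dr - u ln(u), u = 1 + (1-q) L."""
    dL = (L(r + h, q, c) - L(r - h, q, c)) / (2.0 * h)   # central difference
    u = 1.0 + (1.0 - q) * L(r, q, c)
    return (1.0 - q) * r * math.log(r) * dL - u * math.log(u)

if __name__ == "__main__":
    r, q, c = 3.0, 0.7, 0.3
    for m in (2, 3, 2.5):
        print(L(r ** m, q, c), rhs_eq4(r, m, q, c))      # both columns should agree
    print(ode_residual(r, q, c))                          # should be close to zero
```

The check passes for any value of `c`, which reflects the fact that (4) and (5) alone do not fix $c(q)$.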

The function $c(q)$ will be determined later on. Right now we just note that because at 
$q = 1$ Eq.(4) boils down to $\mathcal{L}(r^m) = m\mathcal{L}(r)$ we have $c(1) = 0$. We 
proceed by considering the experiment with outcomes $A = (A_1, A_2, \ldots, A_n)$ and the 
distribution $\mathcal{P} = (p_1, p_2, \ldots, p_n)$. Assume moreover that $p_k$ 
($1 \leq k \leq n$) are rational numbers, i.e., $p_k = g_k/g$, $\sum_{k=1}^{n} g_k = g$ 
with $g_k \in \mathbb{N}$. Let us have, furthermore, an experiment 
$B = (B_1, B_2, \ldots, B_g)$ with distribution $\mathcal{Q} = \{q_1, q_2, \ldots, q_g\}$. 
We split $(B_1, B_2, \ldots, B_g)$ into $n$ groups containing $g_1, g_2, \ldots, g_n$ 
outcomes respectively. Consider now a particular situation in which whenever event $A_k$ 
happens then in $B$ all $g_k$ events of the $k$-th group occur with the equal probability 
$1/g_k$ and all the other events in $B$ have probability zero. Hence 
$\mathcal{D}(B|A = A_k) = \mathcal{D}(1/g_k, \ldots, 1/g_k) = \mathcal{L}_q(g_k)$ and so by 
axiom 3 we have 

$$\mathcal{D}(B|A) = f^{-1}\left( \sum_{k=1}^{n} \rho_k(q)\, f(\mathcal{L}_q(g_k)) \right) . \tag{7}$$

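On the other hand, for our system the entropy $\mathcal{D}(A \cap B)$ can be easily evaluated. 
Realizing that the joint probability distribution corresponding to $A \cap B$ is 

$$\mathcal{R} = \{ r_{kl} = p_k q_{l|k} \} = \Big\{ \underbrace{\frac{p_1}{g_1}, \ldots, \frac{p_1}{g_1}}_{g_1 \times}, \underbrace{\frac{p_2}{g_2}, \ldots, \frac{p_2}{g_2}}_{g_2 \times}, \ldots, \underbrace{\frac{p_n}{g_n}, \ldots, \frac{p_n}{g_n}}_{g_n \times} \Big\} = \{ 1/g, \ldots, 1/g \} ,$$

we obtain that $\mathcal{D}(A \cap B) = \mathcal{L}_q(g)$. Utilizing the first part of axiom 3 
and defining $f_{(a,y)}(x) = f(-ax + y)$ we can write 

$$\mathcal{D}(A) = \frac{\mathcal{L}_q(g) - \mathcal{D}(B|A)}{1 + (1-q)\mathcal{D}(B|A)} = \frac{f^{-1}_{(a,\mathcal{L}_q(g))}\left( \sum_k \rho_k(q)\, f_{(a,\mathcal{L}_q(g))}(-\mathcal{L}_q(p_k)) \right)}{1 - (1-q)\, f^{-1}_{(a,\mathcal{L}_q(g))}\left( \sum_k \rho_k(q)\, f_{(a,\mathcal{L}_q(g))}(-\mathcal{L}_q(p_k)) \right)} , \tag{8}$$

where $a = 1 + (1-q)\mathcal{L}_q(g)$, so that 
$f_{(a,\mathcal{L}_q(g))}(-\mathcal{L}_q(p_k)) = f(\mathcal{L}_q(g p_k)) = f(\mathcal{L}_q(g_k))$. 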



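As Eq.(6) indicates it is $\mathcal{L}_q(1/p_k)$ and not $-\mathcal{L}_q(p_k)$ which 
represents the elementary information of order $q$ affiliated with $p_k$. Thus using the 
relation 

$$-\mathcal{L}_q(p_k) = \frac{\mathcal{L}_q(1/p_k)}{1 + (1-q)\mathcal{L}_q(1/p_k)} , \tag{9}$$

together with the transformation 

$$g(x) = f_{(a,y)}\left( \frac{x}{1 + (1-q)x} \right) , \qquad y = \mathcal{L}_q(g) ,$$

we easily obtain that 

$$\mathcal{D}(A) = g^{-1}\left( \sum_k \rho_k(q)\, g(\mathcal{L}_q(1/p_k)) \right) = f^{-1}\left( \sum_k \rho_k(q)\, f(\mathcal{L}_q(1/p_k)) \right) . \tag{10}$$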
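
The rational-probability construction used in this step can be made concrete. The sketch below builds the joint distribution $r_{kl} = p_k/g_k$ with exact fractions and checks that it is uniform over $g$ outcomes. It also checks numerically the identity $-\mathcal{L}_q(p) = \mathcal{L}_q(1/p)/(1 + (1-q)\mathcal{L}_q(1/p))$, which follows from $\mathcal{L}_q(r) = (r^{c(q)} - 1)/(1-q)$. The function names and the sample values of `q` and `c` are assumptions made for illustration only.

```python
from fractions import Fraction

def conditional_uniform_joint(g_counts):
    """Joint distribution r_kl = p_k * (1/g_k) for the grouped experiment B.

    g_counts[k] is the integer g_k; p_k = g_k / g with g the sum of all g_k.
    """
    g = sum(g_counts)
    joint = []
    for g_k in g_counts:
        p_k = Fraction(g_k, g)
        joint.extend([p_k * Fraction(1, g_k)] * g_k)   # g_k outcomes, each p_k / g_k
    return joint

def L(r, q, c):
    """L_q(r) = (r^c - 1) / (1 - q), with c standing for c(q)."""
    return (r ** c - 1.0) / (1.0 - q)

if __name__ == "__main__":
    print(conditional_uniform_joint([3, 2, 1]))        # p = (1/2, 1/3, 1/6), g = 6
    q, c, p_k = 0.7, 0.3, 1.0 / 3.0
    inv = L(1.0 / p_k, q, c)
    print(-L(p_k, q, c), inv / (1.0 + (1.0 - q) * inv))  # the two values should agree
```

Using `Fraction` keeps the check of uniformity exact, which mirrors the restriction to rational $p_k$ in the argument.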
The last identity in (10) is due to the second part of axiom 3. It is well known from the 
theory of means [12] that (10) can be fulfilled iff $g(x)$ is a linear function of $f(x)$, 
i.e., 

$$f\left( \frac{-x + y}{1 + (1-q)x} \right) = \theta_q(y) f(x) + \vartheta_q(y) . \tag{11}$$

In order to solve (11) we define $\varphi(x) = f(x) - f(0)$. With this notation (11) turns 
into 

$$\varphi\left( \frac{-x + y}{1 + (1-q)x} \right) = \theta_q(y)\varphi(x) + \varphi(y) , \qquad \varphi(0) = 0 . \tag{12}$$

By setting $x = y$ we see that $\theta_q(y) = -1$, hence one finds 

$$\varphi(x + y + (1-q)xy) = \varphi(x) + \varphi(y) . \tag{13}$$

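Eq.(13) is Pexider's functional equation which can be solved by the standard method 
of iterations [13]. Eq.(13) has only one non-trivial class of solutions, namely: 

$$\varphi(x) = \frac{1}{1-\alpha} \ln\left( 1 + (1-q)x \right) . \tag{14}$$

Here $\alpha$ is a free parameter. Plugging this solution back to (10) we obtain 

$$\mathcal{D}_q(A) = \frac{1}{1-q} \left( e^{-c(q) \sum_k \rho_k(q) \ln p_k} - 1 \right) = \frac{1}{1-q} \left( \prod_k (p_k)^{-c(q)\rho_k(q)} - 1 \right) . \tag{15}$$

Note that the constant $\alpha$ got cancelled. It remains to determine $c(q)$. Utilizing the 
conditional entropy constructed from (15) and using axiom 3, we obtain $c(q) = 1 - q$. 
Hence we can recast (15) into a more expedient form 

$$\mathcal{D}_q(A) = \frac{1}{1-q} \left( e^{-(1-q)^2\, d\mathcal{I}_q/dq} \sum_{k=1}^{n} (p_k)^q - 1 \right) . \tag{16}$$

Eqs.(15)-(16) are the sought results. In passing we note that $\mathcal{D}_q \geq 0$ for 
$\forall q \in \mathbb{R}$ and $\lim_{q \to 1} \mathcal{D}_q = \mathcal{I}_1 = \mathcal{S}_1$. 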
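
The properties claimed for the resulting entropy can be probed numerically. The sketch below assumes $c(q) = 1 - q$, natural logarithms, strictly positive $p_k$, and $f(0) = 0$ in the quasi-linear mean, which does not affect the mean. It evaluates the closed form with the escort-weighted exponent, the form containing $d\mathcal{I}_q/dq$, and the quasi-linear mean with $f(x) = \ln(1 + (1-q)x)/(1-\alpha)$ for several $\alpha$. All function names are hypothetical.

```python
import math

def escort(p, q):
    """Escort distribution rho_k(q) = p_k^q / sum_j p_j^q."""
    w = [x ** q for x in p]
    z = sum(w)
    return [x / z for x in w]

def d_entropy(p, q):
    """Closed form with c(q) = 1 - q; Shannon entropy at q = 1. Assumes all p_k > 0."""
    if abs(q - 1.0) < 1e-12:
        return -sum(x * math.log(x) for x in p)
    rho = escort(p, q)
    expo = -(1.0 - q) * sum(r * math.log(x) for r, x in zip(rho, p))
    return (math.exp(expo) - 1.0) / (1.0 - q)

def d_entropy_eq16(p, q):
    """Form with dI_q/dq, using the analytic q-derivative of the Renyi entropy (q != 1)."""
    z = sum(x ** q for x in p)
    dz = sum(x ** q * math.log(x) for x in p)
    dI_dq = math.log(z) / (1.0 - q) ** 2 + dz / ((1.0 - q) * z)
    return (math.exp(-(1.0 - q) ** 2 * dI_dq) * z - 1.0) / (1.0 - q)

def d_entropy_mean(p, q, alpha):
    """Quasi-linear mean with f(x) = ln(1 + (1-q)x) / (1-alpha), alpha != 1, q != 1."""
    f = lambda x: math.log(1.0 + (1.0 - q) * x) / (1.0 - alpha)
    f_inv = lambda y: (math.exp((1.0 - alpha) * y) - 1.0) / (1.0 - q)
    elem = [(x ** (q - 1.0) - 1.0) / (1.0 - q) for x in p]   # L_q(1/p_k) for c(q) = 1 - q
    return f_inv(sum(r * f(e) for r, e in zip(escort(p, q), elem)))

if __name__ == "__main__":
    p = [0.5, 0.3, 0.2]
    for q in (0.5, 1.5, 2.0):
        print(q, d_entropy(p, q), d_entropy_eq16(p, q), d_entropy_mean(p, q, 0.5))
    print(d_entropy(p, 1.0 + 1e-6), d_entropy(p, 1.0))       # approach to Shannon entropy
```

In this sketch the three evaluations agree for each $q$, the result does not depend on $\alpha$, and the values are non-negative, in line with the statements above.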

6 Conclusions and outlooks 

The presented axiomatics might provide a novel playground for $q$ non-extensive systems 
with embedded self-similarity. Indeed, one could expect that the obtained measure 
of information could play a relevant role in $q$ non-extensive statistical systems near 
critical points. Research in this direction is currently in progress. 
A curious result arises when one restricts values of $\mathcal{P}$ by the constraint 
$d\mathcal{I}_q(\mathcal{P})/dq = \max_{\mathcal{P}} \mathcal{I}_q(\mathcal{P})/(1-q)$. 
Eq.(16) then boils down to 

$$\mathcal{D}_q(A) = \frac{1}{1-q} \left( n^{q-1} \sum_{k=1}^{n} (p_k)^q - 1 \right) \equiv \mathcal{C}_q(A) . \tag{17}$$

The reader may recognize in $\mathcal{C}_q$ the generalized measure of cross-entropy of Havrda 
and Charvát [11] (also known as Csiszár's measure of directed divergence [14]) used 
in communication theory. For $q = 2$ we recover the measure. This suggests that 
non-extensivity together with self-similarity may be important concepts also in 
information theory. In this connection such issues as the channel capacity and cutoff 
rates would deserve a separate discussion. 

The generalized entropy $\mathcal{D}_q$ has many desirable features: like Tsallis entropy it 
satisfies the non-extensive $q$-additivity, involves a single parameter $q$, and goes over 
into the standard Shannon entropy in the limit $q \to 1$. On that basis it would appear that 
both $\mathcal{S}_q$ and $\mathcal{D}_q$ have an equal right to furnish a generalization of 
statistical mechanics. 